Module 2: Correlation & Simple Linear Regression

PSYC 3032 M

Udi Alter

About Module 2

Module 2’s topics relate to modelling (linear) relationships between variables to help address interesting questions like…

  • How does education affect earnings?

  • To what extent does listening to ska music relate to wearing black-and-white checkered clothing?

  • How strong is the relationship between taking PSYC3032 and being a billionaire?

    • (nobody said the relationship is positive…)
  • What is the association between OCD and depression?

  • Does exercise influence psychological brain states, such as depression or anxiety?




Correlation and Simple Linear Regression

Prologue: Correlation vs. Regression

Before we dive into each topic separately, it’s useful to put both in the right context.

Correlation and simple regression are often used interchangeably, but there are key conceptual (though not mathematical) differences between them.

  • Correlation describes the strength of the (primarily linear) relationship or association between two variables

    • As one variable changes, what happens to the other variable? Does it go up as well (positive correlation)? Down (negative correlation)? Seem unaffected (no correlation)?
  • Correlation is used mainly as a descriptive statistic, to quantify an association, but it says NOTHING about causation

  • Though, as it turns out, the math required to obtain correlation estimates uses the same information as… you guessed it, simple linear regression!

Prologue: Correlation vs. Regression

  • Simple regression, on the other hand, is about predicting or explaining an outcome/dependent variable using an independent/explanatory variable
    • Has more of a causal (or at minimum, suggestive) nature
  • Useful for experimental, quasi-experimental, and otherwise predictive empirical questions
  • In psychology, we often deal with observational data, meaning that we don’t manipulate the IVs directly (i.e., we don’t manipulate one’s education or earnings directly; we only passively record it)
    • Experimental studies do however try to manipulate the IVs explicitly, and we’ll talk much more about this later

Prologue: Correlation vs. Regression

Note that correlation is cause-blind (association\(\neq\)causation); we often graph the relationship with double-headed arrows (i.e., we don’t know/care about why they relate, we just know they vary together)

Regression models, on the other hand, are necessarily directional (one-sided arrow), meaning, we make a statement/assumption about what causes/affects what (e.g., X2 leads to Y2)




Covariance and Correlation

(Co)variance

  • Before we define correlation, which is, in fact, a standardized effect size measure, we should first talk about covariance, the unstandardized sibling of correlation

  • By now you should know that each variable has its own variance—which describes the spread of the individual observations on that particular variable

\[VAR(X)=\frac{\sum (x_i-\bar{x})^2}{N-1}= \frac{\sum (x_i-\bar{x})(x_i-\bar{x})}{N-1} \]

  • where \(x_i\) is a particular observation’s score on X, \(\bar{x}\) is the mean of X, and \(N\) is the sample size.

  • The numerator represents the sum of squares (i.e., the sum of squared deviation scores from the mean)
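To make the formula concrete, here’s a minimal R sketch (with made-up numbers) that computes the variance “by hand” and checks it against R’s built-in var():

```r
x <- c(2, 4, 6, 8, 10)            # a small made-up variable
n <- length(x)
sum((x - mean(x))^2) / (n - 1)    # sum of squares divided by N - 1: 10
var(x)                            # R's built-in agrees: 10
```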

Covariance

  • If variance is how one variable varies (alone, with itself), then covariance is how one variable varies with another… Thus, the formula for the covariance between two variables is similar to that of variance (but, instead of X twice, we have X and Y):

\[COV(X, \ Y)= \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{N-1}\]

  • This measure describes how much the variables co-vary; the covariance gives us a measure of how the two variables \(X\) and \(Y\) are associated

  • If Y tends (i.e., on average) to be above its mean when X is above its respective mean, then \(COV(X, Y) = +\); if Y tends to be above its mean when X is below its respective mean, then \(COV(X, Y) = -\); when \(COV(X, Y) = 0\), we say that X and Y are uncorrelated or orthogonal (zero covariance rules out a linear association, though not necessarily any association)

  • Covariance is an important statistic, but because it comes in a hybrid metric (the product of the units of X with the units of Y), it’s hard to gauge its magnitude

    • E.g., the covariance of age and height might be in the ballpark of \(40 \ cm \times year\) (what does that even mean? Is it large, small, meh?)
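The same “by hand versus built-in” check works for covariance; the age and height values below are invented, so the resulting number is purely illustrative:

```r
age    <- c(5, 7, 9, 11, 13)          # invented toy data (years)
height <- c(105, 118, 128, 142, 151)  # invented toy data (cm)
sum((age - mean(age)) * (height - mean(height))) / (length(age) - 1)  # 58
cov(age, height)                      # same value, in the hybrid cm x years metric
```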

Correlation

With this definition of covariance we can now define Pearson’s correlation parameter

\[\rho = \frac{COV(X,Y)}{SD(X) \cdot SD(Y)}\]

  • where \(SD(X)\) and \(SD(Y)\) are the standard deviation of X and Y, respectively.

  • Dividing by the standard deviations of X and Y removes both metrics, thereby standardizing the covariance and putting it in a comprehensible metric

  • Correlation is, thus, the standardized version of covariance

  • A correlation coefficient is a single numeric value representing the degree to which two variables are associated with one another

  • Because correlation is a standardized effect size measure, correlation coefficients are bounded by –1 and +1

  • The sign indicates the direction of the association, while the magnitude of the measure indicates the strength of the association

  • \(|1|\) = perfect relationship; 0 = no relationship

Correlation Considerations

  • The correlation formula above is for the Pearson Product-Moment Correlation Coefficient between two continuous variables, but there are others (which we don’t discuss)

    • Rank-ordered variables (1st, 2nd, 3rd) = Spearman’s rank correlation
    • Dichotomous (e.g., 0/1) measures = \(\phi\) (Phi) correlation
    • One dichotomous, one continuous = point-biserial correlation
    • etc.
  • \(\rho\) or its estimate \(r\) do not provide a complete description of the two variables; you should always provide means and standard deviations.

  • Correlation measures the strength of the linear relationship between X and Y only; it’s inappropriate to use a correlation to describe nonlinear relationships

  • Pearson’s correlation assumes that both the variables are normally distributed and that the spread of the scores on one variable is constant across the other

  • We can check all of that, and it’s ALWAYS a good idea to visualize the association (recall the first step in Modeling Steps 👣?)

Visualizing Relationships with Scatterplots

library(ggplot2) # load the package
ggplot(data, aes(x = age, y = height)) +
  geom_point(size=3, color="deepskyblue3") +
  labs(title = "Scatter Plot of Age vs. Height", x = "Age (years)", y = "Height (cm)") + theme_minimal()

Visualizing Relationships with Scatterplots

How would you describe the relationship between age and height? Can you guess the correlation coefficient?

Interpreting Scatterplots

  • Identify the general pattern of the observations
    • The pattern can be described by the form (non-/linear?), direction (goes up/down), and strength of the relationship (are the dots tightly clustered around a discernible trend line/curve?)
  • Identify obvious deviations from the pattern
    • Observations that clearly deviate from the overall pattern may be outliers

In R

We could do this ourselves (first block), or let R do this for us!

cov_ah <- cov(age, height)
sd_age <- sd(age) ; sd_height <- sd(height)
cov_ah/(sd_age*sd_height)
[1] 0.7824843

which is the same as…

cor(age, height)
[1] 0.7824843

Guess the Correlation

Answer

Guess the Correlation

Answer

Guess the Correlation

Answer

Guess the Correlation

Answer

The Importance of Visualizing Data

Effect Sizes

\(r\)

  • A correlation coefficient is an effect size

  • It describes the magnitude and direction of the effect (association between two variables)

  • According to conventional benchmarks (based on Cohen’s rules of thumb):

    • \(r \approx 0.1\) or smaller is a small effect
    • \(r \approx 0.3\) is a medium effect
    • \(r \approx 0.5\) or larger is a large effect

Effect Sizes

\(r^2\)

  • Another way to express the magnitude of the effect is to square the correlation to get the coefficient of determination, \(r^2\)

  • The coefficient of determination provides the proportion of variance in one variable that is shared or accounted for by the other

Effect Sizes

Partial-\(r^2\) & Semi-partial-\(r^2\)
  • Used when interested in the relationship between two variables, controlling for the effects of a third variable

    • Partial-\(r^2\) captures the relationship between X and Y, controlling for the effect that a third variable (Z) has on both X and Y.

    • Semi-partial-\(r^2\) captures the relationship between X and Y, controlling for the effect that Z has only on Y (remember this for multiple regression!)
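One concrete way to see what “removing” Z means is to regress a variable on Z and keep only the residuals. The sketch below uses simulated data and follows the convention described above, where Z is removed from Y only; all variable names and true coefficients are invented:

```r
set.seed(1)
z <- rnorm(200)                    # simulated third variable
x <- 0.6 * z + rnorm(200)          # x partly driven by z
y <- 0.5 * x + 0.3 * z + rnorm(200)
y_given_z <- resid(lm(y ~ z))      # the part of y not shared with z
sr <- cor(x, y_given_z)            # semi-partial correlation of x and y
sr^2                               # squared: a proportion of shared variance
```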

Effect Sizes in R

Returning to our previous example of age and height

cor(age, height) # correlation coefficient, r
[1] 0.7824843
  • \(r=0.78\), a rather large effect size—a strong, positive association



cor(age, height)^2 # coefficient of determination, r^2
[1] 0.6122817
  • \(r^2=0.61\) means that age and height share about 61% of the variance, that’s a lot! (but, also expected, right?)

Effect Sizes in R

When adding a third variable, weight:

library(ppcor)
spcor(data)$estimate # semi-partial correlations (note: not yet squared)
          height       age    weight
height 1.0000000 0.4201188 0.1302424
age    0.3757845 1.0000000 0.3018020
weight 0.1384063 0.3585575 1.0000000


  • The values above are semi-partial correlations: when “controlling” for weight (or holding weight constant), the semi-partial correlation between age and height is about \(sr = 0.38\)

  • Squaring it gives \(sr^2 \approx 0.14\); that is, about 14% of the total variability in height is shared uniquely with age

Statistical Inference and Hypothesis Testing

Say we’re interested in testing whether there is a correlation between two random variables X and Y. We test the null

\[H_0: \rho = 0\]

using the t ratio:

\[t(df)=\frac{r-\rho_0}{SE_r}=\frac{r-0}{\sqrt{\frac{1-r^2}{N-2}}}=\frac{r\sqrt{N-2}}{\sqrt{1-r^2}}\]

with \(df = N − 2\) degrees of freedom, where r is the observed correlation, \(\rho_0\) is the specified correlation value under the null (i.e., 0), and N is the sample size

  • Rejecting this null would indicate that the r you observed was “surprising” given the position of ignorance, \(\rho = 0\)

  • Rejecting the null might not be very impressive on its own (e.g., with a large sample size, even a tiny r is statistically significant), so we probably want to think in terms of CIs (what is \(\rho\) likely to be?)
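Plugging numbers into the t ratio is easy in R; the sketch below uses the round values r = 0.32 and N = 275 purely for illustration:

```r
r <- 0.32; N <- 275                        # illustrative values
t_stat <- r * sqrt(N - 2) / sqrt(1 - r^2)  # the t ratio from the formula above
t_stat                                     # about 5.58
2 * pt(-abs(t_stat), df = N - 2)           # two-tailed p value (very small)
```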

Applied Research Example

A researcher collected data from 275 undergraduate participants and hypothesized a relationship between aggression and impulsivity. She measured aggression using the Buss Perry Aggression Questionnaire (BPAQ) and impulsivity using the Barratt Impulsivity Scale (BIS)


Load the data in R

library(haven)
agrsn <- read_sav("aggression.sav")

Descriptive Statistics and Graphs

Examine descriptive information about your variables, e.g., means, SDs, histograms/boxplots and scatterplots


library(misty)
library(tidyverse)
agrsn %>% select(BPAQ, BIS) %>% descript() # using the pipe operator, %>%, which reads "and then..." 
 Descriptive Statistics

  Variable   n nNA   pNA    M   SD  Min  Max Skew  Kurt
   BPAQ    275   0 0.00% 2.61 0.52 1.34 4.03 0.01 -0.38
   BIS     275   0 0.00% 2.28 0.35 1.42 3.15 0.36 -0.19

Descriptive Statistics and Graphs

ggplot(agrsn, aes(x = BPAQ, y = BIS)) +
  geom_point(size=3, color="deepskyblue3") +
  geom_smooth(method = "lm", colour= "black", linewidth=2)+ # Corr/Reg line
  geom_smooth(colour= "purple", linewidth=2, linetype="dashed", se=F)+ # LOESS LINE
  theme_minimal()

Understanding the Plot

  • The dashed, purple line is the LOESS (Locally Estimated Scatterplot Smoothing) curve, a non-parametric method that fits a smooth curve to data, capturing non-linear, local trends in the data without assuming a predefined model
    • Why is it important? It can tell us if, where, and how there’s a deviation from linearity!
  • The solid, black line is the regression line, the slope of which represents the direction and strength of the correlation
    • The gray band corresponds to the uncertainty: the 95% confidence band around the regression line, which shows the range within which we expect the true regression line to lie (95% of the time, had we conducted this analysis repeatedly, each time with a different sample of the same size)

Hypothesis Testing in R

cor.test(agrsn$BPAQ, agrsn$BIS)

    Pearson's product-moment correlation

data:  agrsn$BPAQ and agrsn$BIS
t = 5.5939, df = 273, p-value = 5.391e-08
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.2103747 0.4229210
sample estimates:
      cor 
0.3206789 
cor(agrsn$BPAQ, agrsn$BIS)^2 # coefficient of determination, r^2
[1] 0.102835

Interpretation and Reporting

When reporting, we should include:

  • Sample size
  • Effect size (\(r = .32\) & \(r^2 = .1\))
  • Uncertainty (95% CI)
  • Interpretation of effect size and uncertainty
  • Analyses results (t and p values with \(df\))
  • If known, bring back to research question and connect to larger lit. body.

Example:

“In our sample of \(N = 275\) undergraduate students, there was a statistically significant relationship between scores on the BPAQ and BIS (\(r = 0.32\), 95% CI \([0.21, 0.42]\), \(t(273) = 5.59\), \(p < 0.001\)). These results suggest a moderate association between aggression and impulsivity. The narrow confidence interval (\([0.21, 0.42]\)) indicates a relatively precise estimate. Furthermore, aggression and impulsivity share about 10% of their variance (\(r^2 = 0.10\)), which is a modest yet meaningful amount.”

Simple Linear Regression

Remember this slide from Module 1? 📈

Mathematically, the GLM can be expressed like this:

\[y_i = \beta_0 + \beta_1 x1_{i} + \beta_2 x2_{i} + \dots + \beta_p xp_{i} + \epsilon_i\]

where

  • \(y_i\): Outcome variable (sometimes called criterion, response, or DV) for participant i
  • \(\beta_0\): Intercept
  • \(\beta_1, \dots, \beta_p\): Coefficients (sometimes called regression coefficients, slopes or partial slopes, effects or fixed effects, estimates/parameters, or betas)
  • \(x1_i, x2_i, \dots, xp_i\): Predictor variables for ppt i (sometimes called explanatory, regressors, covariates or IV)
  • \(\epsilon_i\): Error term

Simple Linear Regression (SLR)

To start, we’re going to focus on simple linear regression, meaning that we have one IV (predictor) and one DV (outcome)

Or, less angrily, the SLR looks like this:

\[y_i = \beta_0 + \beta_1 x1_{i} + \epsilon_i\]

A regression model is a formal model for expressing the tendency of the outcome variable, Y, to vary conditionally on the predictor variable, X.


This SLR model has 3 parameters and 3 variables; can you identify two of each?

Simple Linear Regression (SLR)

Population:

\[y_i = \beta_0 + \beta_1 x1_{i} + \epsilon_i\]

Sample: \[y_i = \hat{\beta}_0 + \hat{\beta}_1 x1_{i} + e_i\]

  • The variables in this model are:
    • The outcome variable, \(y_i\)
    • The predictor variable \(x1_i\)
    • The residual, \(e_i\)
  • The parameters in this model are the intercept (\(\beta_0\)), slope (\(\beta_1\)), and \(\sigma_{\epsilon}\), which is the SD of the errors, and tells us how closely, on average, the model predicts each \(y_i\) observation
    • \(\sigma_{\epsilon}\) is estimated after we obtain the \(\beta\)s via OLS; it’s added to the model so we can get SEs, CIs, and p values for our estimates
    • But, with SLR—much like in correlation—we focus primarily on the relationship between the variables (i.e., the slope parameter) and \(r^2\)

SLR’s Parameters & Variables 🎨

\[y_i = \hat{\beta}_0 + \hat{\beta}_1 x1_{i} + e_i\]

Regression(s) Consideration(s)

  • The linear regression model is designed to work with a continuous outcome variable

  • The residuals, \(e\), represent the inaccuracy of the model’s ability to reproduce (i.e., predict/explain) the value of \(y_i\) for a given person

  • \(\epsilon\) (and thus \(e\)) is assumed (for proper SE estimates, CIs, p values, etc.) to be normally distributed (across all levels of X) with a mean of 0 (the SD/variance is estimated)

    • which is written as \(e_i \sim N(0, \sigma^2_{\epsilon})\) or, alternatively, \(y_i \sim N(\hat{y}_i, \sigma^2_{\epsilon})\)
  • \(X\) and \(\epsilon\) (and thus \(e\)) are assumed to be uncorrelated random variables, and X is assumed to be an error-free (yeah right…) component that we are using to predict values in Y

  • Parameters without the i subscript are constants—they do not vary across observations—\(\beta_0\) and \(\beta_1\) have one value for all observations/individuals

  • Regression models are all about conditional expectations (i.e., conditional means):

\[E(y_i|x1_i) = \beta_0 + \beta_1 x1_i\]
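In R, the fitted conditional mean is exactly what predict() returns. A sketch with simulated data (the true intercept of 3 and slope of 2 are invented for the example):

```r
set.seed(42)                               # simulated, purely illustrative data
x <- runif(50, 0, 10)
y <- 3 + 2 * x + rnorm(50)
fit <- lm(y ~ x)
predict(fit, newdata = data.frame(x = 5))  # estimated E(y | x = 5), close to 13
```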

\(\beta\)s Interpretation

  • The intercept, \(\beta_0\), is the expected value on Y for a hypothetical observation with \(x1_i=0\)

  • The slope measures the strength of the linear relationship between X and Y and indicates that a one-unit increase in X results in an expected/predicted change of \(\beta_1\) in Y


Example: say that you were studying the relationship between a final grade in PSYC 3032 (in %) and the combined score on the assignments (out of a total of 50):

  • You discover the relationship is

\[E( final| assignments) = \beta_0 + \beta_1 \times assignments\] \[E( final| assignments) = 42 + 1.16\times assignments\]

What this means: \(\beta_0 = 42\) indicates that a person who scored 0 on the assignments is expected/predicted to finish the course with 42%, while \(\beta_1 = 1.16\) means that every 1-point increase on the assignments is expected to increase the final grade by 1.16%

Model Predictions

What about a specific individual; for example, someone who obtained 40/50? Notice below that the value of 40 is put into the equation, because \(\beta_1\) is with respect to the raw assignments score.

\[E( final| assignments) = \hat{y}_i= 42 + 1.16\times 40 = 88.4\]

In other words, that individual is expected to obtain 88.4% as their final grade, given their assignment scores (how close this average guess is depends on how good the model is at prediction!).

\(\hat{y_i}\) is called the predicted value for the ith observation; so, for that individual above the predicted/expected value, given their score on the assignment is, \(\hat{y_i}=88.4\).

  • But, let’s say that individual actually got 91% as their final grade; we can calculate the residual (error term) for that individual, \(e_i = y_i - \hat{y_i}= 91-88.4=2.6\) (the difference between what was actually observed versus what was expected under the model)
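The arithmetic above is easy to mirror in R using the estimates from the course-grade example:

```r
b0 <- 42; b1 <- 1.16     # estimates from the example above
y_hat <- b0 + b1 * 40    # predicted final grade for 40/50 on the assignments
y_hat                    # 88.4
91 - y_hat               # residual for someone who actually got 91: 2.6
```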

…Let’s look again at the SLR plot

Model Estimation

The blue line, which is specified by our best \(\beta_0\) and \(\beta_1\) estimates, is called the “Line of Best Fit,” and it “cuts” right through all the observations. But how can we find it!?

Ordinary Least Squares (OLS)

OLS is a mathematical solution for finding out the “best” parameter estimates of a linear regression model.


But, what does “best” even mean?

“Best” parameter estimates means that the model (i.e., the regression line) gives us the most accurate prediction/explanation power! (in our sample)


As its name suggests, OLS gives us the values for the intercept and slope(s) that yield the minimum (least) sum of squared residuals (i.e., “best!”).


We can square the residuals and find their sum. But, wait! We would need to know the estimates of \(\beta_0\) and \(\beta_1\) to get the residuals, right? Recall, \[e_i = y_i-\hat{y}_i=y_i-(\beta_0+\beta_1 x1_i),\]

So what do we do!?

Ordinary Least Squares (OLS)

We can “build” a function from the calculation of all \(e\)s in the sample, but leave two unknowns (intercept and slope):

\[\sum_{i=1}^N{e_i^2}=\sum_{i=1}^N{(y_i-\hat{y}_i)^2}=\sum_{i=1}^N{(y_i-[\beta_0+\beta_1 x1_i])^2}\]

Now, with a little bit of calculus and linear algebra, we can find the solution (the minimum of the function):

  • Differentiate \(\sum{e^2}\) with respect to \(\beta_0\) and \(\beta_1\)
  • Set these partial derivatives equal to 0, and solve each equation to find each parameter



Check in question

Why do we square the residuals?

Ordinary Least Squares (OLS)

Don’t worry too much about the math, all you need to know is that there is an “easy” solution:

\[\hat{\beta}_1=\frac{COV(X,Y)}{VAR(X)} =\frac{\sum_{i=1}^N (x1_i - \bar{x})(y_i-\bar{y})}{\sum_{i=1}^N (x1_i - \bar{x})^2}=r \times \frac{SD(Y)}{SD(X)}\]

With \(\hat{\beta}_1\), we can solve for the intercept:

\[\hat{\beta}_0=\bar{y}-\hat{\beta}_1 \bar{x}\]

And, \(\hat{\sigma}_{\epsilon}\), which is the estimate of \(\sigma_{\epsilon}\) and called the residual SE, can be calculated as:

\[\hat{\sigma}_{\epsilon} = \sqrt{\frac{\sum (y_i-\hat{y}_i)^2}{N-2}}= \sqrt{\frac{\sum e^2}{df}}\]
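These closed-form solutions are exactly what lm() computes under the hood, which we can verify on simulated data (the true intercept and slope below are invented for the sketch):

```r
set.seed(3032)                       # simulated data for illustration
x <- rnorm(100)
y <- 2 + 0.5 * x + rnorm(100)
b1 <- cov(x, y) / var(x)             # closed-form slope
b0 <- mean(y) - b1 * mean(x)         # closed-form intercept
fit <- lm(y ~ x)
c(b0, b1)                            # matches coef(fit)
sqrt(sum(resid(fit)^2) / (100 - 2))  # residual SE, matches sigma(fit)
```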

Check in Question

If both X and Y are standardized, what does \(\hat{\beta}_1\) equal?

Statistical Inference and Hypothesis Testing

Often researchers seek to test the statistical significance of the slope parameter and of the proportion of variability in the outcome it shares/explains.

Specifically, we test the following null and alternative hypotheses:

\[H_0:\beta_1=0\] \[H_1:\beta_1\neq0\]

Rejecting this null, \(H_0:\beta_1=0\), would indicate that the \(\hat{\beta}_1\) you found was “surprising” given the position of ignorance, \(\beta_1 = 0\); that is, the population relationship between X and Y is unlikely to be zero…

Statistical Inference and Hypothesis Testing

…These hypotheses are tested using a type of t test. Specifically, each parameter has its own standard error such that we can calculate a ratio of the effect (estimated parameter) over noise (the standard error) and get a p value associated with the t statistic.

\[t(df)=\frac{\hat{\beta}_1-B}{SE_{\hat{\beta_1}}}\]

where B is the value of \(\beta_1\) under the null; some constant to test against (often, B = 0, which is the default in most software)

It is also easy to obtain CIs for the individual parameters using the following equation:

\[CI_{(1-\alpha)100\%}=\hat{\beta}_1 \pm t_{(1 − \alpha /2, \ df)} \times SE_{\hat{\beta_1}}\]

where \(\alpha\) is the nominal Type I error rate, and \(t_{(1 − \alpha /2, \ df)}\) is the critical value for the t dist. with \(df\) degrees of freedom and \(\alpha\) level of significance
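As a sketch, here is the 95% CI computed “by hand” in R, using the slope estimate (0.4777) and standard error (0.0854) that appear in the lm() output later in the module:

```r
b1_hat <- 0.4777; se_b1 <- 0.0854         # estimate and SE from the example output
df <- 273
t_crit <- qt(1 - 0.05 / 2, df)            # critical t value for a 95% CI
ci <- b1_hat + c(-1, 1) * t_crit * se_b1  # lower and upper limits
ci                                        # roughly [0.31, 0.65], matching confint()
```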

Effect Sizes in Simple Regression

The effect sizes of interest in SLR are the regression analogues of the effect sizes of interest in correlation.

  • Regression slope/coefficient: unstandardized regression coefficients are easy to understand because they have a clear metric.
    • Because they represent the predicted change in the outcome variable for each 1-unit change in the predictor variable, they provide information about the strength of their association
    • Standardized slope estimates are a different parameterization that can also be used as an effect size. In simple regression, the standardized slope is just the correlation.
  • Shared variance, \(r^2\): the proportion of variability in Y explained/predicted by X

Applied Research Example

We will revisit the research example from last week examining the relationship between impulsivity and aggression: 275 undergraduates completed a questionnaire assessing scores on the BPAQ and BIS scales among others.

Ultimately, the researcher is interested in predicting aggression from impulsivity

At a very simple level, the researcher wants to devise a model for aggression (operationalized with BPAQ scores) to explain or predict how and why people vary on this variable, given their BIS score.

Let’s revisit the example from last week to highlight more of a regression lens as opposed to correlation.


Load the data in R

library(haven)
agrsn <- read_sav("aggression.sav")

Descriptive Statistics and Graphs

Examine descriptive information about your variables, e.g., means, SDs, histograms/boxplots and scatterplots


library(misty)
library(tidyverse)
agrsn %>% select(BPAQ, BIS) %>% descript() # using the pipe operator, %>%, which reads "and then..." 
 Descriptive Statistics

  Variable   n nNA   pNA    M   SD  Min  Max Skew  Kurt
   BPAQ    275   0 0.00% 2.61 0.52 1.34 4.03 0.01 -0.38
   BIS     275   0 0.00% 2.28 0.35 1.42 3.15 0.36 -0.19

Descriptive Statistics and Graphs

ggplot(agrsn, aes(x = BIS, y = BPAQ)) + # predictor (BIS) on x, outcome (BPAQ) on y
  geom_point(size=3, color="deepskyblue3") +
  geom_smooth(method = "lm", colour= "black", linewidth=2)+ # Corr/Reg line
  geom_smooth(colour= "purple", linewidth=2, linetype="dashed", se=F)+ # LOESS LINE
  theme_minimal()

Hypothesis Testing in R

SLR.mod <- lm(formula= BPAQ ~ BIS, data=agrsn)
summary(SLR.mod)

Call:
lm(formula = BPAQ ~ BIS, data = agrsn)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.14134 -0.30470  0.00845  0.35500  1.35527 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   1.5217     0.1973   7.713 2.31e-13 ***
BIS           0.4777     0.0854   5.594 5.39e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4972 on 273 degrees of freedom
Multiple R-squared:  0.1028,    Adjusted R-squared:  0.09955 
F-statistic: 31.29 on 1 and 273 DF,  p-value: 5.391e-08
confint(SLR.mod)
                2.5 %    97.5 %
(Intercept) 1.1333249 1.9101360
BIS         0.3095758 0.6458088

Interpretation and Reporting

When reporting, we should include:

  • Sample size
  • Effect size (\(\hat{\beta}_1 = 0.4777\) & \(multiple-r^2 = 0.1028\))
  • Uncertainty (95% CI)
  • Interpretation of effect size and uncertainty
  • Analyses results (t and p values with \(df\))
  • If known, bring back to research question and connect to larger lit. body.

Example:

“In our sample of \(N = 275\) undergraduate students, impulsivity (BIS) was found to predict aggression (BPAQ). For every 1-point increase on BIS, aggression (BPAQ) was predicted to increase by approximately 0.48 points (\(\hat{\beta}_1 = 0.48\), 95% CI \([0.31, 0.65]\)). This association was statistically significant (\(t(273) = 5.59\), \(p < 0.001\)). The narrow confidence interval suggests a rather precise estimate of the effect size. Furthermore, impulsivity explained about 10% of the variability in aggression (\(R^2 = 0.10\)), indicating a modest yet meaningful proportion.”